ProtoNet : Navigating the Hierarchical Clustering of the Protein Space

نویسندگان

  • Ori Sasson
  • Hillel Fleischer
  • Elon Portugaly
  • Yonatan Bilu
  • Nathan Linial
  • Michal Linial
چکیده

The ProtoNet site provides an automatic hierarchical clustering of the protein space. The clustering is based on an all-against-all BLAST similarity test. With this similarity measure we proceed to perform a continuous bottom-up clustering process by applying alternative rules for merging clusters. The outcome of this clustering process is a classification of the input proteins into a hierarchy of clusters of varying degrees of granularity. This clustering can be used for function prediction, for define superfamilies and subfamilies and for large-scale protein annotation purposes. ProtoNet is accessible in the form of an interactive Web site at http://www.protonet.cs.huji.ac.il.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

ProtoNet: hierarchical classification of the protein space

The ProtoNet site provides an automatic hierarchical clustering of the SWISS-PROT protein database. The clustering is based on an all-against-all BLAST similarity search. The similarities' E-score is used to perform a continuous bottom-up clustering process by applying alternative rules for merging clusters. The outcome of this clustering process is a classification of the input proteins into a...

متن کامل

ProtoNet 6.0: organizing 10 million protein sequences in a compact hierarchical family tree

ProtoNet 6.0 (http://www.protonet.cs.huji.ac.il) is a data structure of protein families that cover the protein sequence space. These families are generated through an unsupervised bottom-up clustering algorithm. This algorithm organizes large sets of proteins in a hierarchical tree that yields high-quality protein families. The 2012 ProtoNet (Version 6.0) tree includes over 9 million proteins ...

متن کامل

ProtoNet 4.0: A hierarchical classification of one million protein sequences

ProtoNet is an automatic hierarchical classification of the protein sequence space. In 2004, the ProtoNet (version 4.0) presents the analysis of over one million proteins merged from SwissProt and TrEMBL databases. In addition to rich visualization and analysis tools to navigate the clustering hierarchy, we incorporated several improvements that allow a simplified view of the scaffold of the pr...

متن کامل

Entropy-driven partitioning of the hierarchical protein space

MOTIVATION Modern protein sequencing techniques have led to the determination of >50 million protein sequences. ProtoNet is a clustering system that provides a continuous hierarchical agglomerative clustering tree for all proteins. While ProtoNet performs unsupervised classification of all included proteins, finding an optimal level of granularity for the purpose of focusing on protein function...

متن کامل

Fishing With (Proto)Net—A Principled Approach to Protein Target Selection

Structural genomics strives to represent the entire protein space. The first step towards achieving this goal is by rationally selecting proteins whose structures have not been determined, but that represent an as yet unknown structural superfamily or fold. Once such a structure is solved, it can be used as a template for modelling homologous proteins. This will aid in unveiling the structural ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006